Support Qwen3-Coder-480B-A35B-Instruct on Google Vertex AI #7733
Thank you for your contribution! I've reviewed the changes and have some suggestions for improvement to maintain consistency with the existing codebase.
outputPrice: 1.15,
description: "Meta Llama 4 Maverick 17B Instruct model, 128K context.",
},
"qwen/qwen3-coder-480b-a35b-instruct-maas": {
I noticed the model ID uses a namespace prefix qwen/ while the other MaaS model llama-4-maverick-17b-128e-instruct-maas doesn't use any prefix. Is this intentional? Should we maintain consistency across MaaS models, or is the namespace required for Qwen models specifically?
This is intentional. The model name that must be passed to Vertex AI is "qwen/qwen3-coder-480b-a35b-instruct-maas", for example:
ENDPOINT=us-south1-aiplatform.googleapis.com
REGION=us-south1
PROJECT_ID="YOUR_PROJECT_ID"
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
"https://${ENDPOINT}/v1/projects/${PROJECT_ID}/locations/${REGION}/endpoints/openapi/chat/completions" \
-d '{"model":"qwen/qwen3-coder-480b-a35b-instruct-maas", "stream":true, "messages":[{"role": "user", "content": "Write a quick sort algorithm"}]}'
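For readers who prefer to see the same request shape in TypeScript, here is a minimal sketch that builds the URL and JSON body used by the curl command above. The helper name `buildVertexChatRequest` and the `ChatMessage` type are hypothetical and are not part of this PR; authentication (the bearer token from `gcloud auth print-access-token`) is deliberately left out.

```typescript
// Hypothetical helper for illustration only; not part of the PR or of any SDK.
interface ChatMessage {
	role: "system" | "user" | "assistant"
	content: string
}

function buildVertexChatRequest(
	projectId: string,
	region: string,
	model: string,
	messages: ChatMessage[],
): { url: string; body: string } {
	// The regional endpoint follows the pattern used in the curl example.
	const endpoint = `${region}-aiplatform.googleapis.com`
	const url = `https://${endpoint}/v1/projects/${projectId}/locations/${region}/endpoints/openapi/chat/completions`
	// The model ID keeps its "qwen/" namespace prefix.
	const body = JSON.stringify({ model, stream: true, messages })
	return { url, body }
}

const request = buildVertexChatRequest(
	"YOUR_PROJECT_ID",
	"us-south1",
	"qwen/qwen3-coder-480b-a35b-instruct-maas",
	[{ role: "user", content: "Write a quick sort algorithm" }],
)
```

The point of the sketch is that the namespaced model ID goes into the request body unchanged, while the region appears twice in the URL (host and path).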
supportsPromptCache: false,
inputPrice: 1,
outputPrice: 4,
description: "Qwen 3 Coder 480B Instruct",
Could we make the description more detailed to match the pattern of other models? Consider:
- description: "Qwen 3 Coder 480B Instruct",
+ description: "Qwen 3 Coder 480B Instruct model, 262K context.",
This would be consistent with the Llama model description above which includes the context window size.
Here is the model card: https://cloud.google.com/vertex-ai/generative-ai/docs/maas/qwen/qwen3-coder

Note: There's a more comprehensive PR that includes this model and others already pending: PR #7727

Agreed. #7727 is a superset of this PR.

Closed in favor of #7727
Description
Support Qwen3-Coder-480B-A35B-Instruct on Google Vertex AI
Important
Add support for the qwen/qwen3-coder-480b-a35b-instruct-maas model and the us-south1 region in the Vertex AI configuration.
Adds qwen/qwen3-coder-480b-a35b-instruct-maas to vertexModels in vertex.ts with 65,536 max tokens, 262,144 context window, input price 1, output price 4.
Adds us-south1 to VERTEX_REGIONS in vertex.ts.
This description was generated automatically for 4c5d79e.
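As a rough illustration of the change summarized in this description, the new entry might look something like the sketch below. Only `supportsPromptCache`, `inputPrice`, `outputPrice`, and `description` appear in the diff fragments quoted in the review; the field names `maxTokens` and `contextWindow`, the `ModelInfoSketch` type, and the `*Sketch` constant names are assumptions made for this example, and the real `vertex.ts` may be structured differently.

```typescript
// Sketch only: maxTokens and contextWindow are assumed field names.
interface ModelInfoSketch {
	maxTokens: number
	contextWindow: number
	supportsPromptCache: boolean
	inputPrice: number
	outputPrice: number
	description: string
}

// Stand-in for the vertexModels map; other models are omitted.
const vertexModelsSketch: Record<string, ModelInfoSketch> = {
	"qwen/qwen3-coder-480b-a35b-instruct-maas": {
		maxTokens: 65536,
		contextWindow: 262144,
		supportsPromptCache: false,
		inputPrice: 1,
		outputPrice: 4,
		description: "Qwen 3 Coder 480B Instruct",
	},
}

// Stand-in for VERTEX_REGIONS; only the region added by this PR is listed.
const vertexRegionsSketch: string[] = ["us-south1"]
```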